BCC 

checked 



title 



Microprocessor Crash procedures 
Initialization and Breakpoints 



prefix/class-number, revision 

MPREC/W-30 




approved 



24/ 



authors 



Butler W. Lampson 






approval date 

9/30/69 



revision date 



classification 

Working Paper 



distribution 

Company private 



pages 

8 



ABSTRACT and CONTENTS 

Specifies the actions expected of all microprocessors in 
connection with system crashes and the earliest stages of 
initialization. Describes the uniform microcode which is 
used to handle the breakpoint feature. 
Introduction 

The basic philosophy of the M1 system with regard to the 
detection and correction of system errors, both hardware and 
software, is that 

1) As much checking as possible should be done for 
conditions which indicate the possibility of a system 
failure. 

2) When an error is detected, some standard action should 
be taken. This action should be reasonably flexible 
(and possibly complex), but it should be able to proceed 
even when most of the hardware is not working. 

3) This action will normally include an attempt to obtain 
and record as much information as possible about the 
failure. 

4) It will normally also include a complete cleanup of 
the system, so that execution can continue with 
some confidence that any errors in system tables have 
been corrected. 

The ITP is the vehicle for most of the logging and recovery 
procedure, as described below. 
Hardware Failure Reporting 

The system is equipped with five words' worth of bits which are 
set as the result of detected malfunctions in the hardware: 
power failures, overheating, parity errors, etc. These words 
can be read by the UTP; their format is described in SR/S-8. 
When a word is read it is reset to 0 (although it may 
immediately become non-zero if error conditions persist). The 
union of all the error conditions except those in SWR5 is a 
signal called the system failure level or crash signal, which 
is used to set the UTP's STROBE2 flip-flop. Other 
microprocessors can also set this flip-flop by directing a STROBE2 
to the UTP, and are expected to do so whenever they detect 
an error. 
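
The read-and-reset behaviour of the warning words and the derivation of the crash signal can be modelled roughly as follows. This is only a sketch: the class and method names are invented for illustration, each word is held as a plain integer, and indexing the five words 0 through 4 with SWR5 last is an assumption. The actual bit format is the one given in SR/S-8 and is not modelled.

```python
class WarningRegisters:
    """Toy model of the five system warning words (names are hypothetical)."""

    def __init__(self):
        # Index 0..4 stands for the five words; index 4 plays the role of SWR5.
        self.words = [0] * 5

    def set_bits(self, index, mask):
        # The hardware sets bits when it detects a malfunction.
        self.words[index] |= mask

    def read(self, index):
        # Reading a word returns its contents and resets it to 0.
        value = self.words[index]
        self.words[index] = 0
        return value

    def crash_signal(self):
        # Union of all error conditions except those in the last word (SWR5).
        return any(word != 0 for word in self.words[:4])
```

A bit set again after a read (a persisting error condition) would simply reappear through another call to `set_bits`.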

When the UTP sees its STROBE2, it enters a system recovery 
mode. Its first step in this mode is to send STROBE2 signals 
to all the other microprocessors. Each processor can test its 
STROBE2 with a branch condition, and is expected to do this 
as frequently as is convenient, but at least often enough to 
bring all activity to an orderly halt within 500 µs from 
the moment when the level appears. After bringing things to 
a halt in this way, the processor records any unusual 
conditions it knows about in an error reporting region (ERR) 
in core at a location peculiar to it. The processor then 
dumps its state as for a break and hangs until the break 
wait location (BRKWAIT), which is another core location 
peculiar to it, becomes non-zero. It zeros this location before 
hanging. When BRKWAIT = 12343210B, it reloads the state 
(exactly as though a break had completed) and starts executing 
at the location given by the break address word BRKADR. 
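
The halt sequence for one processor can be sketched as below. Core is modelled as a Python dictionary from address to word, and the function name, its parameters and the `release` callback (a stand-in for whatever later writes BRKWAIT and BRKADR) are all hypothetical. The sketch treats 12343210B as the value that ends the wait, which is one reading of the paragraph above.

```python
RELEASE_VALUE = 0o12343210  # 12343210B, the BRKWAIT value quoted in the text


def respond_to_crash(core, err, save, brkwait, brkadr, conditions, state, release):
    """Toy model of one processor's actions after it has halted on STROBE2."""
    # Record any unusual conditions in the error reporting region (ERR).
    for i, word in enumerate(conditions):
        core[err + i] = word
    # Dump the state as for a break.
    for i, word in enumerate(state):
        core[save + i] = word
    # Zero BRKWAIT, then hang until it holds the release value.
    core[brkwait] = 0
    while core[brkwait] != RELEASE_VALUE:
        release(core)  # assumption: some other processor updates core meanwhile
    # Reload the state and resume at the address held in BRKADR.
    restored = [core[save + i] for i in range(len(state))]
    return restored, core[brkadr]
```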

The CPU follows a slightly different procedure; it treats 
crash exactly like single-step. 
Recovery Action 

The UTP plays a special role in the handling of failures. 
When it sees the crash signal, instead of going into a wait 
state like the other processors, it turns control over to an 
ITP program which tries to figure out what is going on. This 
is done in a rather cautious fashion, however, to avoid 
getting into trouble if critical parts of the system are not 
working. 

1) First the system warning registers are checked to make 
sure that power to the UTP and core is not about to 
fail. If it is, bits 0-11 in SSL are turned on and 
the processor hangs. 

2) A special microcode sequence is entered which tests 
the memory for reliable operation. If the memory 
proves to be so faulty that an ITP program cannot 
execute, the sequence turns on bits 12-23 in SSL and hangs. If 
it appears that the memory could function in 4-module 
mode, some as yet undefined action is taken. If the 
memory is OK, the procedure continues with step 3. 

3) Schedule mode is turned off and the ITP reset sequence 
is executed. This will start the ITP running, and it 
can proceed to determine the magnitude of the problem 
and decide what to do about it. 
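
The three steps amount to a small decision procedure, sketched here. The function name, the string labels for the memory test result and the returned action names are invented for illustration; only the SSL bit ranges and the order of the checks come from the text.

```python
def utp_recovery(power_failing, memory_status):
    """Return (SSL bits turned on, action) for the cautious recovery sequence.

    memory_status is one of "unusable", "four_module" or "ok" (hypothetical labels).
    """
    # Step 1: power to the UTP and core is about to fail.
    if power_failing:
        return set(range(0, 12)), "hang"
    # Step 2: memory too faulty for an ITP program to execute.
    if memory_status == "unusable":
        return set(range(12, 24)), "hang"
    # Step 2, alternative: memory might work in 4-module mode.
    if memory_status == "four_module":
        return set(), "undefined"  # the text leaves this action undefined
    # Step 3: schedule mode off, ITP reset sequence executed.
    return set(), "reset_itp"
```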
System Restart 

As a result of a crash signal, or perhaps for other reasons, 
it may be desirable for the ITP to restart one or all of the 
processors. To this end it has the ability to send a ZM 
signal (clear 0) to any of them. In this way the system restart 
procedure described elsewhere can readily be initiated. 

To send ZM to other processors, the ITP should address device 
UPZM with a POT instruction and transmit a data word which 
has the same form as the word which a microprocessor puts on 
the X bus when it sends a STROBE. 
Initialization and Breakpoints 

When a microprocessor (other than the CPU) finds itself at 
location 0, it proceeds as follows: 

1) Check for a break (branch condition BREAK=1, #45). 
If there is a break, store the state at SAVE in the 
following format 

M 
SK1 - SKn (n = 31 or 63) 
R0 - R6 
OS 
Q 
Z 

then wait for BRKWAIT to become non-zero. When it 
does, reload the state, take the new value for the O 
register from BRKADR and resume execution. 

Code to handle this aspect of initialization is listed 
below. It may be found on (LAMPSON) BREAK. It saves 
M in SK0 and exchanges the scratchpad with core to 
save or restore it. Sending control to LOADST loads 
the state. 

2) If there is no break, wait for a STROBE. 

3) When the STROBE is received, take any special 
initialization action which may be required. 

4) Then load state as if returning from a break. 
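
The save format in step 1 can be turned into word offsets from SAVE, as in the sketch below. The function name is hypothetical, and the sketch assumes each item occupies exactly one word in the order listed, with M taking the first slot (the text says M is kept in SK0).

```python
def save_area_layout(n):
    """Map each saved item to its offset from SAVE; n is 31 or 63."""
    names = ["M"]                                  # M occupies the SK0 slot
    names += [f"SK{i}" for i in range(1, n + 1)]   # SK1 .. SKn
    names += [f"R{i}" for i in range(7)]           # R0 .. R6
    names += ["OS", "Q", "Z"]
    return {name: offset for offset, name in enumerate(names)}


layout = save_area_layout(63)
print(len(layout), layout["R0"], layout["Z"])
```

Under these assumptions the larger scratchpad (n = 63) needs 74 words, which is less than the 120B (80 decimal) words that separate consecutive SAVE areas in the table of fixed core addresses.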






Fixed Core Addresses and Parameters 



Microprocessor numbers N: 

1 = AMC 
2 = UTP 
3 = CHIO 
4 = CPU0 
5 = CPU1 

BRKADR = 20B + (N-1) (not for CPUs) 
BRKWAIT = 23B + (N-1) 
ERR = 2440B + (N-1)*4 
SAVE = 2500 + (N-1)*120B (not for CPUs) 
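
As a worked example, the formulas can be evaluated for each processor number. The function and dictionary names below are hypothetical, and the sketch assumes that the SAVE base, which is printed without a B suffix, is octal like the other constants.

```python
PROCESSORS = {"AMC": 1, "UTP": 2, "CHIO": 3, "CPU0": 4, "CPU1": 5}


def fixed_addresses(name):
    """Return the fixed core addresses for one microprocessor."""
    n = PROCESSORS[name]
    addrs = {
        "BRKWAIT": 0o23 + (n - 1),
        "ERR": 0o2440 + (n - 1) * 4,
    }
    if not name.startswith("CPU"):
        # BRKADR and SAVE are not defined for the CPUs.
        addrs["BRKADR"] = 0o20 + (n - 1)
        addrs["SAVE"] = 0o2500 + (n - 1) * 0o120  # assumption: 2500 is octal
    return addrs


for proc in PROCESSORS:
    print(proc, {key: oct(value) for key, value in fixed_addresses(proc).items()})
```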
PAGE 1    SYSTEM (HECKEL)/BREAK    09/26/69 0647 

* PARAMETERS DEFINING THE SAVE AREA, USED BY THE STATE STORE AND 
* LOAD ROUTINES 

          DP SAVE←36a08;             * FOR UTP, SEE SYSP/W-15 
          DP SAVER0←SAVE+LSCRATCH; 

* THE CODE SAVES R0-R6, OS, Q AND Z IN SEQUENCE 

          DP SAVER1←SAVER0+1; 
          DP SAVEZ←SAVER0+9; 

* A SCRATCHPAD LOCATION MUST BE DEDICATED TO SAVING M 

          DE SK0 AS SSMREG; 

* THE O REGISTER IS RESTORED FROM 

          DP BREAK←aiS;              * FOR UTP, SEE SYSP/W-15 

* AND WE WAIT AFTER STORING THE STATE FOR 

          DP BRKWAIT←S6B;            * FOR UTP, SEE SYSP/W-15 

* TO BECOME NON-ZERO BEFORE STARTING TO RELOAD. 

* BREAK: DUMP STATE IN THE SAVE AREA. THE STRATEGY IS AS FOLLOWS 
*  1) SAVE M IN SSMREG IN THE SCRATCHPAD 
*  2) STORE MAR AT SAVER0, STORE OS AT SAVEOS 
*  3) STORE Z,Q AT SAVEZ,Q AND R1-R6 AT SAVER1 TO SAVER6 
*  4) EXCHANGE SCRATCHPAD AND SAVE TO SAVE+LSCRATCH-1 
* AFTER SAVING THE STATE, BREAK WAITS UNTIL BRKWAIT 
* BECOMES NON-ZERO. 

* THE RELOAD STRATEGY IS OBTAINED BY DOING STEPS 4-3 IN REVERSE, 
* AND THEN 
*  2) FETCH SAVEOS AND DGOTO IT 
*  1) DGOTO *+1 (THIS SETS UP OS), FETCH BREAK 
*  0) DGOTO M (THIS SETS UP O), FETCH SAVER0 
* -1) MAR←M, M←SSMREG 

          MACRO IMS←MAR←Z←Z+1, STORE; 
SAVEST:   MAR←SAVER0, STORE; 
          M←Z, MAR←SAVEZ, STORE; 
          M←R1, MAR←Z←SAVER1, STORE; 
          M←R2, IMS; 
          M←R3, IMS; 
          M←R4, IMS; 
          M←R5, IMS; 
          M←R6, IMS; 
          M←OS, IMS; 
          M←Q, MAR←Z←MAR+1, STORE; 
          MAR←SAVE-1; 
          R2←-LSCRATCH, Z←0, CALL XSCRATCH; 
BWAIT:    MAR←BRKWAIT, FETCH; 
          GOTO *-1 ON M=0, R2←-LSCRATCH; 
LOADST:   MAR←SAVE-1, Z←0, CALL XSCRATCH; 
          MAR←SAVER1, FETCH; 
          R1←M, CALL FN; 
          R2←M, CALL FN; 
          R3←M, CALL FN; 
          R4←M, CALL FN; 
          R5←M, CALL FN; 
          R6←M, CALL FN; 

          DGOTO M, MAR←MAR+1, FETCH; 
          Q←M, MAR←MAR+1, FETCH, CALL *+1; 
          Z←M, MAR←BREAK, FETCH; 
          DGOTO M, MAR←SAVER0, FETCH; 
          MAR←M, M←SSMREG; 

* SUBROUTINE TO BUMP MAR AND FETCH 
FN:       MAR←MAR+1, FETCH, RETURN; 

* SUBROUTINE TO EXCHANGE -<R2> SCRATCHPAD LOCATIONS, STARTING AT <Z>, 
* WITH CORE LOCATIONS STARTING AT MAR+1. IT CLOBBERS M,Z,R1,R2 

XSCRATCH: FETCH, MAR←MAR+1; 
          R1←SKZ; 
          SKZ←M, M←R1, DGOTO XSCRATCH; 
          STORE, Z←Z+1, RETURN ON R2←R2+1>=0; 
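
The effect of the XSCRATCH routine, which the text describes as exchanging the scratchpad with core to save or restore it, can be illustrated with a short model. This is a sketch of the exchange idea only, not a transcription of the microcode: the function name and parameters are hypothetical, and scratchpad and core are modelled as Python lists.

```python
def exchange_scratchpad(scratchpad, core, base, count):
    """Swap `count` scratchpad words with core[base : base + count], in place."""
    for z in range(count):
        scratchpad[z], core[base + z] = core[base + z], scratchpad[z]


scratchpad = [11, 22, 33]
core = [0, 0, 7, 8, 9, 0]
exchange_scratchpad(scratchpad, core, base=2, count=3)  # save: state goes to core
exchange_scratchpad(scratchpad, core, base=2, count=3)  # restore: state comes back
```

Because the operation is a swap, the same routine serves for both saving and restoring: applying it twice leaves the scratchpad and core as they were.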



